Master’s Degree in Data Science

I am an international student from Mexico.

Filtering Columns and Rows

When using dplyr, you can choose columns with select and rows with filter. Let’s look at an example using the Lahman baseball database. We first have to load the Lahman and dplyr packages.

library(Lahman)
library(dplyr)
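
As a quick aside before the baseball data, the difference between the two verbs can be sketched on a small made-up data frame. The `toy` data frame and its values are hypothetical and not part of Lahman; this is only a minimal illustration.

```r
library(dplyr)

# Hypothetical toy data, made up for illustration only
toy <- data.frame(
  name  = c("a", "b", "c"),
  year  = c(1926, 1927, 1927),
  score = c(10, 20, 30)
)

# select() keeps columns by name; every row is retained
toy_cols <- toy %>% select(name, score)

# filter() keeps rows where the condition is TRUE; every column is retained
toy_rows <- toy %>% filter(year == 1927)
```

Here `toy_cols` still has all three rows but only two columns, while `toy_rows` has all three columns but only the two 1927 rows.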

Now, suppose we would like to see the home run totals for the 1927 New York Yankees. We could run the following code:

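Batting %>%
  # keep only the columns we need
  select(playerID, yearID, teamID, HR) %>%
  # keep only the rows for the 1927 Yankees (yearID is numeric, so no quotes)
  filter(teamID == 'NYA' & yearID == 1927)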
##     playerID yearID teamID HR
## 1  beallwa01   1927    NYA  0
## 2  bengobe01   1927    NYA  0
## 3  collipa01   1927    NYA  7
## 4  combsea01   1927    NYA  6
## 5  duganjo01   1927    NYA  2
## 6  durstce01   1927    NYA  0
## 7  gazelmi01   1927    NYA  0
## 8  gehrilo01   1927    NYA 47
## 9  giardjo01   1927    NYA  0
## 10 grabojo01   1927    NYA  0
## 11  hoytwa01   1927    NYA  0
## 12 koenima01   1927    NYA  3
## 13 lazzeto01   1927    NYA 18
## 14 meusebo01   1927    NYA  8
## 15 moorewi01   1927    NYA  1
## 16 morehra01   1927    NYA  1
## 17 paschbe01   1927    NYA  2
## 18 pennohe01   1927    NYA  0
## 19 pipgrge01   1927    NYA  1
## 20 ruethdu01   1927    NYA  1
## 21  ruthba01   1927    NYA 60
## 22 shawkbo01   1927    NYA  0
## 23 shockur01   1927    NYA  0
## 24 thomamy01   1927    NYA  0
## 25  weraju01   1927    NYA  1
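
Because teamID and yearID are identical in every row of this result, one possible variation is to filter first and then select only the columns that still vary. The sketch below assumes the same Lahman `Batting` table; the name `yankees_1927` is just an illustrative choice. Note that the order matters here: once select has dropped a column, a later filter can no longer refer to it.

```r
library(Lahman)
library(dplyr)

# Filter rows first, while teamID and yearID are still available,
# then keep only the columns that differ between players
yankees_1927 <- Batting %>%
  filter(teamID == "NYA", yearID == 1927) %>%
  select(playerID, HR)
```

Separating conditions with a comma inside filter behaves the same as joining them with `&`.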